fix: PgVectorDocumentStore _treat_meta_field and comparison functions now return Composed - string escaping done by psycopg#2964
Conversation
_treat_meta_field and comparison functions now return Composed - string escaping done by psycopg
anakin87
left a comment
There was a problem hiding this comment.
Nice work!
I left some comments
| assert _treat_meta_field(field="meta.name", value=None) == "meta->>'name'" | ||
| assert _treat_meta_field(field="meta.other", value={"a": 3, "b": "example"}) == SQL("meta->>") + SQLLiteral("other") | ||
| assert _treat_meta_field(field="meta.empty_list", value=[]) == SQL("meta->>") + SQLLiteral("empty_list") | ||
| assert _treat_meta_field(field="meta.name", value=None) == SQL("meta->>") + SQLLiteral("name") |
There was a problem hiding this comment.
Let's discuss tests: I understand that the return type has changed, so we need to do something like this, which, unfortunately, is hard to read/maintain.
An alternative would be to build a psycopg cursor and use it to convert SQL to str, keeping the test similar to the previous one. An example for test_treat_meta_field:
def test_treat_meta_field(document_store):
document_store._ensure_db_setup()
cursor = document_store._cursor
expr = _treat_meta_field(field="meta.number", value=9)
assert expr.as_string(cursor) == "(meta->>'number')::integer"
...The only problem with this approach: the test is no longer a pure unit test since it needs a DB connection. But we keep readability. WDYT?
There was a problem hiding this comment.
What about something like this:
def test_treat_meta_field():
cast_integer = SQL("(") + SQL("meta->>") + SQLLiteral("number") + SQL(")::integer")
no_cast_name = SQL("meta->>") + SQLLiteral("name")
cast_real = SQL("(") + SQL("meta->>") + SQLLiteral("number") + SQL(")::real")
cast_boolean = SQL("(") + SQL("meta->>") + SQLLiteral("bool") + SQL(")::boolean")
assert _treat_meta_field(field="meta.number", value=9) == cast_integer
assert _treat_meta_field(field="meta.number", value=[1, 2, 3]) == cast_integer
assert _treat_meta_field(field="meta.name", value="my_name") == no_cast_name
assert _treat_meta_field(field="meta.name", value=["my_name"]) == no_cast_name
assert _treat_meta_field(field="meta.number", value=1.1) == cast_real
assert _treat_meta_field(field="meta.number", value=[1.1, 2.2]) == cast_real
assert _treat_meta_field(field="meta.bool", value=True) == cast_boolean
assert _treat_meta_field(field="meta.bool", value=[True, False]) == cast_booleanit compares two Composed instances
There was a problem hiding this comment.
This test remains quite understandable in any case.
test_comparison_condition_missing_operatortest_logical_condition_nested is hard to read instead.
For this reason, I was advocating to resolve to string in the test.
What's your opinion?
There was a problem hiding this comment.
I'm confused.
To make sure I understand your point.
You agree with the approach for the test test_treat_meta_field as is, but instead of repr() you suggest having it as a string to compare the full output - is that it?
Regarding the test_comparison_condition_missing_operator- it's not changed by this PR - but I miss what you are suggesting.
There was a problem hiding this comment.
Sorry, I was pointing to the wrong test. It's test_logical_condition_nested that gets a bit hard to read.
If we want to resolve to a string for easier comparison, we can use a cursor as I proposed in #2964 (comment)
In any case, I'll leave this decision up to you.
anakin87
left a comment
There was a problem hiding this comment.
Looks good.
Feel free to take #2964 (comment) into consideration or disregard it if you think that tests are sufficiently readable
|
Ah! I see what you mean now. I can fix the readability of that test as well. |
Related Issues
PGVectorDocumentStorefilters to prevent SQL injection vectors. #2881Proposed Changes:
_treat_meta_fieldreturns apsycopg.sql.Composedobject and usingpsycopg.sql.Literalto embed the field name, so string escaping is now done by psycopg itselfHow did you test it?
SQLLiteralChecklist
fix:,feat:,build:,chore:,ci:,docs:,style:,refactor:,perf:,test:.